The Lasso under Heteroscedasticity
Abstract
The Lasso is a popular method for variable selection in regression. Much theoretical understanding has recently been obtained on its model selection, or sparsity recovery, properties under sparse and homoscedastic linear regression models. Since these standard model assumptions are often not met in practice, it is important to understand how the Lasso behaves under nonstandard model assumptions. In this paper, we study the sign consistency of the Lasso under one such model, in which the variance of the noise scales linearly with the expectation of the observation. This sparse Poisson-like model is motivated by medical imaging. In addition to studying sign consistency, we give sufficient conditions for $\ell_\infty$ consistency. With theoretical and simulation studies, we provide conditions under which the Lasso should not be expected to be sign consistent. One interesting finding is that $\beta^*$ cannot be spread out. Precisely, for both deterministic design and random Gaussian design, the sufficient conditions for the Lasso to be sign consistent require $\|\beta^*\|_2/[M(\beta^*)]^2$ to be not too big, where $M(\beta^*)$ is the smallest nonzero element of $|\beta^*|$. By special designs of $X$, we show that $\|\beta^*\|_2/[M(\beta^*)]^2 = o(n)$ is almost necessary. For Positron Emission Tomography (PET), this suggests that when there are dense areas of the positron-emitting substance, less dense areas are not well detected by the Lasso. This is of particular concern when imaging tumors: the periphery of the tumor will produce a much weaker signal than the center, leading to a large $\|\beta^*\|_2/[M(\beta^*)]^2$. We compare the sign consistency of the Lasso under the Poisson-like model to its sign consistency under the standard model, which assumes that the noise is homoscedastic. The comparison shows that when $\beta^*$ is spread out, the Lasso performs worse on data from the Poisson-like model than on data from the standard model, confirming our theoretical findings.
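The comparison described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' experiment. It assumes a noise model of the form Var(ε_i) = σ²·|x_i^T β*| for the Poisson-like case, a random Gaussian design, a plain coordinate-descent Lasso solver, and arbitrary values for n, p, σ and the penalty λ. All function names (`lasso_cd`, `simulate`, `sign_recovered`, `recovery_rate`) are hypothetical and introduced only for illustration.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/(2n))*||y - X b||_2^2 + lam*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # (1/n) * ||x_j||^2 for each column
    r = y.copy()                        # running residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the partial residual (b_j removed).
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            # Soft-thresholding update for coordinate j.
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r += X[:, j] * (b[j] - new)
            b[j] = new
    return b


def simulate(beta, n, sigma, poisson_like, rng):
    """Draw a Gaussian design and a response under one of the two noise models."""
    X = rng.standard_normal((n, beta.size))
    mean = X @ beta
    if poisson_like:
        # Assumed model: noise variance proportional to |E[y_i]|.
        scale = sigma * np.sqrt(np.abs(mean))
    else:
        # Standard model: constant noise variance.
        scale = sigma
    y = mean + scale * rng.standard_normal(n)
    return X, y


def sign_recovered(beta_hat, beta):
    """True when the estimate has exactly the sign pattern of the true vector."""
    return bool(np.array_equal(np.sign(beta_hat), np.sign(beta)))


def recovery_rate(beta, n, sigma, lam, poisson_like, trials=50, seed=0):
    """Fraction of simulated data sets on which the Lasso recovers the signs."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        X, y = simulate(beta, n, sigma, poisson_like, rng)
        hits += sign_recovered(lasso_cd(X, y, lam), beta)
    return hits / trials


if __name__ == "__main__":
    # Illustrative, arbitrary settings: a "spread out" beta with three strong
    # signals and one weak one, so ||beta||_2 / M(beta)^2 is large.
    p, n, sigma, lam = 20, 200, 1.0, 0.2
    beta = np.zeros(p)
    beta[:3] = 20.0
    beta[3] = 1.0
    for flag in (False, True):
        rate = recovery_rate(beta, n, sigma, lam, poisson_like=flag)
        print("Poisson-like" if flag else "homoscedastic", rate)
```

Varying the gap between the strong and weak entries of `beta` changes the ratio discussed in the abstract, so the two printed recovery rates can be compared as that ratio grows. The value of `lam` here is a placeholder and is not tuned according to the paper's theory.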
Similar resources
Pivotal estimation via square-root Lasso in nonparametric regression
We propose a self-tuning $\sqrt{\text{Lasso}}$ method that simultaneously resolves three important practical problems in high-dimensional regression analysis, namely it handles the unknown scale, heteroscedasticity and (drastic) non-Gaussianity of the noise. In addition, our analysis allows for badly behaved designs, for example, perfectly collinear regressors, and generates sharp bounds even in extreme case...
Forecasting Wind Power – Modeling Periodic and Non-linear Effects Under Conditional Heteroscedasticity
In this article we present an approach that enables joint wind speed and wind power forecasts for a wind park. We combine a multivariate seasonal time varying threshold autoregressive moving average (TVARMA) model with a power threshold generalized autoregressive conditional heteroscedastic (power-TGARCH) model. The modeling framework incorporates diurnal and annual periodicity modeling by peri...
The Lasso under Poisson-like Heteroscedasticity
The performance of the Lasso is well understood under the assumptions of the standard sparse linear model with homoscedastic noise. However, in several applications, the standard model does not describe the important features of the data. This paper examines how the Lasso performs on a non-standard model that is motivated by medical imaging applications. In these applications, the variance of t...
Adaptive Lasso and group-Lasso for functional Poisson regression
High dimensional Poisson regression has become a standard framework for the analysis of massive counts datasets. In this work we estimate the intensity function of the Poisson regression model by using a dictionary approach, which generalizes the classical basis approach, combined with a Lasso or a group-Lasso procedure. Selection depends on penalty weights that need to be calibrated. Standard ...
A structural model on a hypercube represented by optimal transport
We propose a flexible statistical model for high-dimensional quantitative data on a hypercube. Our model, called the structural gradient model (SGM), is based on a one-to-one map on the hypercube that is a solution for an optimal transport problem. As we show with many examples, SGM can describe various dependence structures including correlation and heteroscedasticity. The maximum likelihood e...